SD-Map - A Fast Algorithm for Exhaustive Subgroup Discovery

Authors

  • Martin Atzmüller
  • Frank Puppe
Abstract

In this paper we present the novel SD-Map algorithm for exhaustive yet efficient subgroup discovery. In contrast to heuristic or sampling-based methods, SD-Map is guaranteed to identify all interesting subgroup patterns contained in a data set. The algorithm builds on the well-known FP-growth method for mining association rules, adapted to the subgroup discovery task. We show how SD-Map can handle missing values, and provide an experimental evaluation of the algorithm's performance on synthetic data.
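
To make the notion of exhaustive subgroup discovery concrete, the following is a minimal brute-force sketch in Python. It is not SD-Map and does not use FP-growth; it simply enumerates every conjunction of attribute=value selectors up to a fixed depth and scores each one, which is the guarantee (no subgroup is skipped) that an exhaustive method provides. The quality function (weighted relative accuracy), the function names, the toy data and the parameters max_depth and top_k are all assumptions made for illustration and are not taken from the paper. Missing values are not handled here.

```python
from itertools import combinations


def wracc(n_sub, pos_sub, n_all, pos_all):
    """Weighted relative accuracy (an assumed quality function):
    coverage * (target share in subgroup - target share overall)."""
    if n_sub == 0:
        return 0.0
    return (n_sub / n_all) * (pos_sub / n_sub - pos_all / n_all)


def exhaustive_subgroups(rows, target, max_depth=2, top_k=3):
    """Score every conjunction of up to max_depth selectors (brute force)."""
    attrs = sorted({a for r in rows for a in r if a != target})
    # A selector is a hypothetical (attribute, value) pair seen in the data.
    selectors = sorted({(a, r[a]) for r in rows for a in attrs})
    n_all = len(rows)
    pos_all = sum(1 for r in rows if r[target])

    results = []
    for depth in range(1, max_depth + 1):
        for combo in combinations(selectors, depth):
            # Skip conjunctions that constrain the same attribute twice.
            if len({a for a, _ in combo}) < depth:
                continue
            covered = [r for r in rows if all(r[a] == v for a, v in combo)]
            if not covered:
                continue
            pos_sub = sum(1 for r in covered if r[target])
            score = wracc(len(covered), pos_sub, n_all, pos_all)
            results.append((score, combo))

    # Best score first; ties are broken by the selector tuple for determinism.
    results.sort(key=lambda t: (-t[0], t[1]))
    return results[:top_k]


# Toy data, invented for this example only.
rows = [
    {"smoker": "yes", "age": "old", "ill": True},
    {"smoker": "yes", "age": "young", "ill": True},
    {"smoker": "no", "age": "old", "ill": False},
    {"smoker": "no", "age": "young", "ill": False},
    {"smoker": "yes", "age": "old", "ill": True},
    {"smoker": "no", "age": "old", "ill": True},
]

top = exhaustive_subgroups(rows, target="ill")
for score, combo in top:
    print(round(score, 4), " AND ".join(f"{a}={v}" for a, v in combo))
```

The cost of this naive enumeration grows combinatorially with the number of selectors and with max_depth, which is the motivation for performing the same exhaustive search over a compact structure such as the FP-tree mentioned in the abstract.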

Similar resources

Knowledge-intensive subgroup mining: techniques for automatic and interactive discovery

Data mining has proved its significance in various domains and applications. As an important subfield of the general data mining task, subgroup mining can be used, e.g., for marketing purposes in business domains, or for quality profiling and analysis in medical domains. The goal is to efficiently discover novel, potentially useful and ultimately interesting knowledge. However, in real-world si...

Knowledge Discovery, Data Mining and Machine Learning Editors

Subgroup mining is a flexible data mining method that considers a given target variable and aims to discover interesting subgroups with respect to this property of interest. In this paper, we especially focus on the handling of continuous target variables: We propose novel formalizations of effective pruning strategies for reducing the search space, and we present the SD-Map* algorithm that ena...

Any-time Diverse Subgroup Discovery with Monte Carlo Tree Search

The discovery of patterns that accurately discriminate one class label from another remains a challenging data mining task. Subgroup discovery (SD) is one of the frameworks that makes it possible to elicit such interesting hypotheses from labeled data. A question remains fairly open: How to select an accurate heuristic search technique when exhaustive enumeration of the pattern space is infeasible? Exist...

Fast Description-Oriented Community Detection using Subgroup Discovery

Communities can intuitively be defined as subsets of nodes of a graph with a dense structure. However, for mining such communities usually only structural aspects are taken into account. Typically, no concise and easily interpretable community description is provided. For tackling this issue, we focus on fast description-oriented community detection using subgroup discovery, cf. [1, 2]. In orde...

Generic Pattern Trees for Exhaustive Exceptional Model Mining

Exceptional model mining has been proposed as a variant of subgroup discovery especially focusing on complex target concepts. Currently, efficient mining algorithms are limited to heuristic (non-exhaustive) methods. In this paper, we propose a novel approach for fast exhaustive exceptional model mining: We introduce the concept of valuation bases as an intermediate condensed data representation...

Publication date: 2006